"Video Compression and Decompression Processing and Processors" (U.S. patent 5,379,351).
These patents are incorporated herein by reference.


At page 8, lines 24-31, and page 9, lines 1-5, please replace the paragraph as follows
(changes are shown on the pages attached hereto):

It will be appreciated that the double-talk detector 318 receives the transmit audio signal
on line 342 after the echo has been canceled. This is because it is desirable to compare the
received audio signal to the transmit audio signal without the echo. In the case where there is a
strong coupling between the speaker 322 and microphone 320 it may be difficult to determine the
proper time at which to adjust the filter coefficients. An example scenario is where the speaker is
placed near the microphone, and the filter is not yet converged. If there is silence at the near-end,
and a far-end audio signal is received (where "far-end" refers to signals received by codec 324),
the conditions are proper to adapt the filter. However, the double-talk detector will erroneously
detect a near-end signal because the far-end signal fed back to the microphone is not canceled by
the echo-cancellation circuitry. When the speaker and microphone are placed near one another,
the double-talk detector may never find that it is appropriate to adapt the coefficients, and
therefore the coefficients will not converge to a useful state.
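
The stalled-adaptation scenario above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the text does not name the adaptation algorithm, so a normalized LMS (NLMS) update is used as a common stand-in, and the two-tap echo path, the step size, and the function names nlms_step and run are hypothetical. The near end is silent throughout, so the microphone picks up only the far-end echo.

```python
import random


def nlms_step(weights, far_end_history, mic_sample, step_size, eps=1e-8):
    """One NLMS update (assumed algorithm); returns (new_weights, residual)."""
    echo_estimate = sum(w * x for w, x in zip(weights, far_end_history))
    residual = mic_sample - echo_estimate  # transmit signal after cancellation
    if step_size == 0.0:
        # The detector has blocked adaptation: coefficients stay unchanged.
        return list(weights), residual
    norm = sum(x * x for x in far_end_history) + eps
    gain = step_size * residual / norm
    return [w + gain * x for w, x in zip(weights, far_end_history)], residual


def run(step_size, n=2000, seed=0):
    """Simulate n samples with near-end silence; return the final weights."""
    rng = random.Random(seed)
    echo_path = [0.8, 0.3]  # hypothetical strong speaker-to-microphone coupling
    weights = [0.0, 0.0]    # filter not yet converged
    history = [0.0, 0.0]
    for _ in range(n):
        history = [rng.uniform(-1.0, 1.0)] + history[:-1]
        mic = sum(h * x for h, x in zip(echo_path, history))
        weights, _ = nlms_step(weights, history, mic, step_size)
    return weights
```

Calling run(0.5) lets the weights approach the assumed echo path, while run(0.0), which models a detector that never permits adaptation, leaves them at their initial values.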



At page 13, lines 12 - 30, and page 14, lines 1-7, please replace the paragraph as follows
(changes are shown on the pages attached hereto):



For video-assisted double-talk detection, the estimated near-end energy, E_near, is
combined with the mouth motion energy, E_motion, to calculate the probability of near-end
silence P(silence|E_near, E_motion). This is accomplished by calculating, according to
Bayes' Rule:

P(silence|E_near, E_motion) =

P(E_near|silence)*P(E_motion|silence)*P(silence)/(P(E_near)*P(E_motion))

P(E_near|silence) is the probability of observing the particular value of E_near in the case of
near-end silence. These values are measured by a histogram technique prior to the operation of
the system and stored in a look-up table. P(silence) is the probability of near-end silence and is
usually set to 1/2. P(E_near) is the probability of observing the particular value of E_near under
all operating conditions, i.e., both with near-end silence AND near-end speech. These values are
also measured by a histogram technique prior to the operation of the system and stored in a
look-up table. In the same way, P(E_motion|silence) and P(E_motion) are measured prior to operation
of the system and stored in additional look-up tables. In a refined version of the double-talk
detector, the tables for P(E_near|silence) and P(E_near) are replaced by multiple tables for
different levels of the estimated values of ERLE. In this way, the different reliability levels for
estimating E_near in different states of convergence of filter 314 can be taken into account. The
resulting probability P(silence|E_near, E_motion) is finally compared to a threshold to decide
whether the condition of near-end silence is fulfilled that would allow a reliable, fast adaptation
of the filter 314 by adapter 316. In addition, the double-talk detector compares the short-term
received audio energy E_receive with another threshold to determine whether there is enough
energy for reliable adaptation. If both thresholds are exceeded, an adaptation with a non-zero
step-size by adapter 316 is enabled; otherwise the step-size is set to zero.
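
The table look-ups and the two threshold tests described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the described system: the bin edges, the table entries, both threshold values, the step size, and all function names are hypothetical placeholders for values that would be measured by the histogram technique before operation. Because the formula divides by the product P(E_near)*P(E_motion), the computed value is not guaranteed to stay at or below 1 for arbitrary tables, so the sketch clamps it. The ERLE-dependent tables of the refined version are omitted.

```python
from bisect import bisect_right

# Hypothetical histogram bin edges and look-up tables (assumed values).
ENERGY_EDGES = [0.1, 1.0]                # splits E_near into 3 bins
MOTION_EDGES = [0.5]                     # splits E_motion into 2 bins
P_NEAR_GIVEN_SILENCE = [0.7, 0.2, 0.1]   # P(E_near|silence) per bin
P_NEAR = [0.4, 0.25, 0.35]               # P(E_near) per bin
P_MOTION_GIVEN_SILENCE = [0.8, 0.2]      # P(E_motion|silence) per bin
P_MOTION = [0.5, 0.5]                    # P(E_motion) per bin
P_SILENCE = 0.5                          # prior, set to 1/2 as in the text

SILENCE_THRESHOLD = 0.6                  # assumed value
RECEIVE_THRESHOLD = 0.05                 # assumed value
STEP_SIZE = 0.5                          # assumed non-zero step size


def lookup(table, edges, value):
    """Return the table entry for the histogram bin that contains value."""
    return table[bisect_right(edges, value)]


def p_silence(e_near, e_motion):
    """P(silence|E_near, E_motion) from the product formula, clamped to 1."""
    numerator = (lookup(P_NEAR_GIVEN_SILENCE, ENERGY_EDGES, e_near)
                 * lookup(P_MOTION_GIVEN_SILENCE, MOTION_EDGES, e_motion)
                 * P_SILENCE)
    denominator = (lookup(P_NEAR, ENERGY_EDGES, e_near)
                   * lookup(P_MOTION, MOTION_EDGES, e_motion))
    return min(1.0, numerator / denominator)


def adaptation_step_size(e_near, e_motion, e_receive):
    """Return a non-zero step size only when both thresholds are exceeded."""
    near_end_silent = p_silence(e_near, e_motion) > SILENCE_THRESHOLD
    enough_far_end_energy = e_receive > RECEIVE_THRESHOLD
    return STEP_SIZE if (near_end_silent and enough_far_end_energy) else 0.0
```

With these placeholder tables, low near-end energy combined with little mouth motion and sufficient received energy enables adaptation, while high near-end energy, visible mouth motion, or too little received energy each force the step size to zero.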

At page 15, lines 3-17, please replace the paragraph as follows (changes are shown on
the pages attached hereto):

In yet another embodiment, the absence of detected mouth movement can be used to
advantageously increase the video quality. For example, the hearing impaired may use
videoconferencing arrangements for communicating with sign language. Because sign language
uses hand movement instead of sound, the channel devoted to audio may instead be used to
increase the video frame rate, thereby enhancing the quality of sign language transmitted via
videoconferencing. Thus, if no mouth movement is detected, the system may automatically
make the necessary adjustments. A related patent is U.S. Patent No. 6,404,776 issued on June
11, 2002, entitled "Data Processor Having Controlled Scalable Input Data Source and Method
Thereof," docket number 8X8S.15USI1, which is hereby incorporated by reference. Other
embodiments are contemplated as set forth in co-pending U.S. Patent No. 6,124,882 issued on
September 26, 2000, entitled "Videocommunicating Apparatus and Method Therefor" by Voois
et al., as well as various video communicating circuit arrangements and products, and their